Learning document category descriptions through the extraction of semantically significant phrases

Author

Bruce Krulwich

Abstract

This paper discusses an intelligent agent that learns to identify documents of interest to particular users, in a distributed and dynamic database environment with databases consisting of mail messages, news articles, technical articles, on-line discussions, client information, proposals, design documentation, and so on. The agent interacts with the user to categorize each liked or disliked document, uses significant-phrase extraction and inductive learning techniques to determine recognition criteria for each category, and routinely gathers new documents that match the user's interests. We present the models used to describe the databases and the user's interests, and discuss the importance of techniques for acquiring high-quality input for learning algorithms.

Similar resources

A new text-mining method for extracting user context information to improve the ranking of search engine results

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual material increases day by day, so we need effective ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the ones most similar to the user ...

Extracting descriptions of problems with product and services from twitter data

There is ample evidence that social media contains timely information that businesses could use to their benefit. In this paper we discuss automatic extraction of descriptions of problems from Twitter data. More specifically, we present a system that filters tweets related to an enterprise and extracts descriptions of problems with its products or services. First step of this extraction process i...

A Semi-Supervised Key Phrase Extraction Approach: Learning from Title Phrases through a Document Semantic Network

It is a fundamental and important task to extract key phrases from documents. Generally, phrases in a document are not independent in delivering the content of the document. In order to capture and make better use of their relationships in key phrase extraction, we suggest exploring the Wikipedia knowledge to model a document as a semantic network, where both n-ary and binary relationships amon...

Extraction of Significant Phrases from Text

Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This pape...

Computing Science Group Learning to Extract Significant Phrases from Text

Publication date: 1995
